x
, the standardized value z
is calculated as .Imagine a dataset for house pricing, with features like size
(in square feet) and number of bedrooms
. These features are on different scales. If you standardize these features, each will contribute equally to the distance calculations in a model, like k-NN.
x
, the normalized value x'
is calculated as .Consider a neural network processing images, where each pixel intensity ranges from 0 to 255. Normalizing these intensities to a range of 0 to 1 can make the network train more efficiently.
Standardization is particularly useful when the features in your dataset have different units of measurement or vastly different scales. It's commonly used in algorithms that are sensitive to the variance in data, like Support Vector Machines (SVMs) and k-Nearest Neighbors (k-NN).
Normalization is essential when your data needs to be on the same scale, such as in the case of neural networks, which often require input data to be normalized. Itβs also useful for image data processing where pixel intensities need to be normalized.
Property | Standardization | Normalization |
---|---|---|
Definition | Rescales data to have a mean of 0 and a standard deviation of 1. | Rescales data to a fixed range, typically 0 to 1. |
Formula | ||
Scale | No fixed range; values centered around 0. | Fixed range (e.g., 0 to 1). |
Sensitivity to Outliers | Less sensitive to outliers. | More sensitive to outliers. |
Typical Use Cases | Models sensitive to the scale of input data (e.g., SVM, k-NN, Neural Networks). | Data bounded within a range, like pixel intensities in image processing. |
Where to Use | In algorithms that assume data is centered and standardized, especially when features have different units/scales. | When data needs to be normalized to a specific scale, such as in neural network algorithms requiring input normalization. |
Where Not to Use | May not be necessary for tree-based algorithms that are invariant to the scale of features. | Not ideal when the original distribution of the data is important or when outliers are critical to the analysis. |
Get the latest updates, exclusive content and special offers delivered directly to your mailbox. Subscribe now!